Remove RWLock from EntryNotifier because it causes perf degradation #33797
…hen entry notifications are enabled on geyser
…emove-rwlock-entry-notifier
@@ -8,4 +8,4 @@ pub trait EntryNotifier {
    fn notify_entry(&self, slot: Slot, index: usize, entry: &EntrySummary);
}

pub type EntryNotifierLock = Arc<RwLock<dyn EntryNotifier + Sync + Send>>;
Please remove the unused RwLock to make sure clippy is happy
done!
…emove-rwlock-entry-notifier
@lijunwangs do these remaining buildkite steps need to pass? They seem to be pulling in the wrong dependencies and failing bc of that
Those failures seem legitimate, and not related to dependencies:
…emove-rwlock-entry-notifier
Codecov Report
@@           Coverage Diff           @@
##           master   #33797   +/-   ##
=======================================
  Coverage    81.9%    81.9%
=======================================
  Files         809      809
  Lines      218253   218252    -1
=======================================
+ Hits       178771   178821   +50
+ Misses      39482    39431   -51
Oh rip I must have missed a push. Updated, all passing now
Looks good
Problem
When upgrading to 1.16 we observed that some of our geyser nodes could not keep up with the chain and had wild CPU fluctuations (running on 7443p). The first solution we found was to move to a larger CPU (7543p), which sort of solved the issue. We profiled a local cluster to find any hotspots and found that the EntryNotifier was using significantly more CPU time than any other Geyser method. It seemed to be due to thread contention from recv_timeout or from the write lock it was obtaining. As a simple first step I removed the RWLock on EntryNotifier, because it is unnecessary, and re-profiled. This removed the CPU overhead. I tested this change on MB with our geyser plugins and they are now stable on 7443p again.
Here is an example running the yellowstone grpc plugin. It comes from running a local cluster for about 10 minutes and shows the perf improvement. On MB the impact is much more severe, but I didn't profile there due to time and perf data size. Since a test of this change on MB fixed the issue, I felt this was enough to illustrate the problem and the solution.
With RWLock - 0.234% CPU time
Without RWLock - 0.147% CPU time
Summary of Changes
Remove RWLock from EntryNotifier
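For readers who want to see the shape of the change, here is a minimal sketch, not the actual patch. It relies on the fact visible in the diff above that `notify_entry` takes `&self`, so a shared `Arc<dyn EntryNotifier + Sync + Send>` can be called from several threads without taking any lock. The alias name `EntryNotifierArc`, the `CountingNotifier` type, the `notify_from_threads` helper, and the stand-in definitions of `Slot` and `EntrySummary` are assumptions made for illustration only.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Stand-ins for the real types, which this PR excerpt does not show in full.
pub type Slot = u64;
pub struct EntrySummary {
    pub num_transactions: u64,
}

pub trait EntryNotifier {
    fn notify_entry(&self, slot: Slot, index: usize, entry: &EntrySummary);
}

// Before: pub type EntryNotifierLock = Arc<RwLock<dyn EntryNotifier + Sync + Send>>;
// After (alias name is hypothetical): no RwLock, callers share the notifier directly.
pub type EntryNotifierArc = Arc<dyn EntryNotifier + Sync + Send>;

/// Hypothetical notifier that only counts the entries it is told about.
pub struct CountingNotifier {
    pub seen: AtomicUsize,
}

impl EntryNotifier for CountingNotifier {
    fn notify_entry(&self, _slot: Slot, _index: usize, _entry: &EntrySummary) {
        // Any interior state is synchronized by the implementation itself.
        self.seen.fetch_add(1, Ordering::Relaxed);
    }
}

/// Calls `notify_entry` from several threads without acquiring a lock and
/// returns how many notifications the notifier observed.
pub fn notify_from_threads(threads: usize, entries_per_thread: usize) -> usize {
    let counter = Arc::new(CountingNotifier { seen: AtomicUsize::new(0) });
    // Unsized coercion from Arc<CountingNotifier> to the trait-object alias.
    let notifier: EntryNotifierArc = counter.clone();

    let handles: Vec<_> = (0..threads)
        .map(|t| {
            let notifier = Arc::clone(&notifier);
            thread::spawn(move || {
                let entry = EntrySummary { num_transactions: 0 };
                for index in 0..entries_per_thread {
                    notifier.notify_entry(t as Slot, index, &entry);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    counter.seen.load(Ordering::Relaxed)
}
```

With this shape, a plugin that needs mutable state keeps its own synchronization inside the `EntryNotifier` implementation (here an atomic counter), instead of every caller paying for an outer lock on each notification.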